Abstract

This paper addresses the problems in
providing graphic displays automatically to serve
a user naive with respect to computer graphic
devices. It identifies the properties of data
that affect graphic representation and presents a
formalism in which to view them. It also
discusses and illustrates the selection of various
graphic formats based on the data to be
represented, its properties, and graphic device
characteristics.

I. Problem Statement


Broadly speaking, there are three phases of
using computers: acquiring, processing and
presenting information. As to the first two, many
years of research and development have led to the
availability of efficient ways of collecting and
processing data. However, methods of presenting
information are by and large limited to variations
of tabular form. Reading a sequence of lines and
understanding their import is a tedious job
though, reminding people of the old proverb,
"A picture is worth a thousand words." As a result,
efforts are now being directed towards presenting
such data graphically. Unfortunately, using
graphic devices can be a complex process,
requiring days or even weeks of training. Up to
now, it has been almost impossible for a naive
user to create a graphic display to view
information.

* This research is partially supported by DARPA
grant IMM903-80-C-0093.

Our  long  range  goal  is  to  have  an  intelligent 
system  helping  users  in  the  graphical  display  of 
data,  performing  the  task  of  a  graphic  artist. 
Our  objective,  at  present,  is  to  facilitate 
automatic  display  of  information  by  providing 
reasonable  defaults  for  graphical  representations 
and  easy  user  modification  of  the  resulting 
displays. 

The major problem in developing such a system
is that there is a gap between the way a user
conceives of a graphic display and the way the
machine does. For the user, it is a meaningful
picture made up of certain particular pieces; for
the machine, it is the sequence of operations
needed to create such a display. A second problem
is that a user will not think to make explicit
what s/he does not care about or what s/he
believes the system already knows or is able to
infer. What is needed is a graphic expert system
that, on the one hand, is at an appropriate
conceptual level for the user to state things that
s/he cares about, but, on the other, provides
appropriate defaults to take care of everything
else.

Research has proven that graphic presentation
of information is better than tabular form.
Tabular form merely presents raw data without
interpretation (Gene Zelazny, 1972), whereas
pictorial form conveys the relationship between
the data items.


To illustrate this contrast between tabular
and graphic presentation, consider the following
example. Using the Harvest system (Harvest,
1979), a database query system, a naive user can
type in


YEAR = 1980 DISPLAY BUDGET


and get a formatted output as shown below:

BUDGET FOR 1980

                     AMOUNT*
1. SALARIES            35
2. TRAVEL              10
3. EQUIPMENT           25
4. MAINTENANCE         18
5. MISCELLANEOUS       12

* Thousands of dollars

For tabular form output, systems such as HARVEST
provide default formats. This relieves the user
of the need to provide detailed format
specifications, a burdensome task especially when
s/he does not care more about the format than
that it be easy to read.

However, it is not currently possible to
request a graphic display in the same easy terms -
i.e., to type in an equally short request and get
a graphic display in return.

[Figure: graphic display of BUDGET FOR 1980]

No existing system provides the default graphical
formats needed to provide such a service.

There are some "high level" software packages
commercially available, such as PLOT-10 and DISSPLA
[ISSCO], that allow an applications programmer to
use a graphic device at a programming language
level. Interactive systems like Tell-A-Graf
[ISSCO] require users to enter data and specify
their preferences completely. But none of these
systems can provide default displays for either
completely or incompletely specified choices.
What is needed is a highly automated graphics
system to meet the needs of naive users who
either do not want to specify any preferences
about the graphic display or give incomplete
specifications.

This paper discusses appropriate defaults for
those aspects of a display the user has failed to
specify and how those defaults depend on three
factors: the data to be displayed, the device on
which it is to be displayed and the users it is
displayed for. Two different types of defaults
are considered: defaults affecting the choice of
graph through which to display the data and
defaults affecting the choice of "attributes" for
that graph, such as color, size, orientation,
order and other factors. These defaults are used
to provide a naive user with the ability to see
his or her numeric data (which would otherwise be
presented as a table of numbers) in the form of a
pie chart, bar graph or trends graph.




II. Definitions


Before introducing the system and basic
assumptions for the system, we shall define the
concepts we will be using:

1. CONTINUITY: is a boolean value that
   represents whether or not the members of
   an ordered set represent an interval of a
   continuum with respect to the given
   ordering. Example: A set of days,
   {Sunday, Monday, Tuesday, Wednesday,
   Thursday, Friday, Saturday} could be
   defined to represent a WEEK, an interval
   of time, and have the property
   continuity, while {Sunday, Tuesday,
   Saturday} may not, and {Sunday, Tuesday,
   Friday, Wednesday, Monday, Saturday,
   Thursday} may not.

2. TOTALITY: is a boolean value that
   represents whether or not the members of
   a set represent ALL the component parts
   of an object or an abstract concept.
   Example: the set of items {Salaries,
   Travel, Equipment, Maintenance,
   Miscellaneous} could be defined to
   represent the parts of which BUDGET is
   composed, and have the property
   totality. The subset {Salaries, Travel,
   Maintenance} would not have totality.

3. CARDINALITY: is the number of elements
   in a set. Example: the cardinality of
   the set of days is 7.

4. MULTIPLICITY: is the number of values
   assigned to each element in a domain set
   by a mapping. Example: the mapping
   "square root" from real numbers into
   complex numbers has the multiplicity of
   2.

5. UNITS: is the set of labels specifying
   the unit of measurement associated with
   each numerical value. Example:
   Thousands of dollars, Hundreds of tons,
   etc.
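The definitions above can be made concrete with a small sketch. The function names (`has_continuity`, `has_totality`, `cardinality`, `multiplicity`) are illustrative assumptions and not part of the system described in this paper. Note also that the paper treats continuity and totality as properties that are declared for a set ("could be defined"), so the two boolean checks below are only one possible heuristic for suggesting them.

```python
from typing import Hashable, Mapping, Sequence, Tuple

WEEK = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]


def has_continuity(members: Sequence[str], ordering: Sequence[str]) -> bool:
    """Assumed heuristic: members, in the order given, form one unbroken run of `ordering`."""
    if not members:
        return False
    try:
        positions = [ordering.index(m) for m in members]
    except ValueError:
        return False  # some member does not belong to the ordering at all
    start = positions[0]
    return positions == list(range(start, start + len(members)))


def has_totality(members: Sequence[str], parts: Sequence[str]) -> bool:
    """Assumed heuristic: members are exactly ALL the component parts of the whole."""
    return set(members) == set(parts)


def cardinality(members: Sequence[str]) -> int:
    """Number of distinct elements in the set."""
    return len(set(members))


def multiplicity(mapping: Mapping[Hashable, Tuple[float, ...]]) -> int:
    """Number of values the mapping assigns to each domain element."""
    sizes = {len(values) for values in mapping.values()}
    if len(sizes) != 1:
        raise ValueError("every domain element must map to the same number of values")
    return sizes.pop()
```

For instance, `has_continuity(WEEK, WEEK)` holds, while the subset and the shuffled week from definition 1 both fail the check.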


One of the factors upon which effective
automatic data display depends comprises
particular characteristics of the data itself. By
abstracting out these characteristics, one can
form a well defined bijection mapping that can
help one to understand the complex phenomenon of
data and its manipulations.

Let this abstract form of data be represented
by the word title, a mapping from the domain set
of labels into the range set of quantities. That
is,

\[
\text{title}: \{l_1, l_2, \ldots, l_m\} \rightarrow
\{(q_{11}, q_{12}, \ldots, q_{1n}),
  (q_{21}, q_{22}, \ldots, q_{2n}), \ldots,
  (q_{m1}, q_{m2}, \ldots, q_{mn})\}
\]

where, for every i=1 to m, $l_i$ is the ith element
in the domain set and for every i=1 to m and j=1
to n, $q_{ij}$ is the jth component of the ith tuple in
the range set. Each column also has an entity
called units and another entity called
column-label.

In other words, the data in the range set is a
matrix of size m rows and n columns.
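One way to hold such a mapping in a program is sketched below. The class name `Title` and its field names are assumptions chosen for illustration, not names used by the system. The sample data are the BUDGET FOR 1980 figures that the paper uses in Section III.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Title:
    """A mapping from m row-labels to m tuples of n quantities (hypothetical layout)."""
    name: str
    row_labels: List[str]                 # domain set l_1 .. l_m
    column_labels: List[str]              # one label per column
    units: List[str]                      # one unit per column
    quantities: List[Tuple[float, ...]]   # row i holds (q_i1, ..., q_in)

    def __post_init__(self) -> None:
        n = len(self.column_labels)
        if len(self.quantities) != len(self.row_labels):
            raise ValueError("need exactly one tuple of quantities per row-label")
        if any(len(row) != n for row in self.quantities):
            raise ValueError("every tuple needs one quantity per column-label")
        if len(self.units) != n:
            raise ValueError("need exactly one unit per column-label")

    @property
    def shape(self) -> Tuple[int, int]:
        """(m, n): cardinality of the row-labels and multiplicity of the mapping."""
        return (len(self.row_labels), len(self.column_labels))

    def value(self, row_label: str, column_label: str) -> float:
        """Look up q_ij by its two labels."""
        i = self.row_labels.index(row_label)
        j = self.column_labels.index(column_label)
        return self.quantities[i][j]


budget = Title(
    name="BUDGET FOR 1980",
    row_labels=["Salaries", "Travel", "Equipment", "Maintenance", "Miscellaneous"],
    column_labels=["Amount"],
    units=["thousands of dollars"],
    quantities=[(35,), (10,), (25,), (18,), (12,)],
)
```

Keeping the labels, units and quantities in one record means that cardinality and multiplicity never have to be supplied separately; they fall out of the shape of the matrix.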

The cardinality of row-labels and
multiplicity of the mapping can be derived from
input data. However, two additional properties of
this mapping that are necessary to select a
display format are not directly derivable from the
input data itself. These are:
(i) whether elements of either row-labels or
column-labels form component parts of some whole
with respect to the quantities represented by each
member of the column-labels and row-labels
respectively: that is, whether either set has
totality.
(ii) whether elements of either row-labels or
column-labels denote a continuum with respect
to the quantities represented by each member of
the column-labels and row-labels respectively:
that is, whether either set has continuity. For
example, consider the following table:

NET INCOME PER YEAR

[Table: net income in dollars for COMPANY-1 and
COMPANY-2, 1972 through 1979]

In this example, the row-labels are 1972, 1973,
1974, 1975, 1976, 1977, 1978 and 1979 and the
column-labels are COMPANY-1 and COMPANY-2. The
continuity of row-labels could be true or false
with respect to each column-label. If the
comparison of incomes for two companies over the
period of time is preferred, then the continuity of
row-labels would be {true, true} with respect to
each of the column-labels. If an absolute
comparison of incomes is preferred then the
continuity of row-labels would be {false, false}.
The totality of row-labels could be true or false.
If a relative comparison of each year's income
with respect to the total income of each company
is preferred, then the totality of row-labels would
be {true, true}; otherwise, it would be
{false, false}. Similarly, the continuity and
totality could be defined for column-labels. The
cardinality of row-labels is 8. The multiplicity
of the mapping is 2. The units are dollars for
each column-label.


III.  Examples 


Having defined the concepts that we will be
using, to demonstrate how the above mentioned
ideas can be used to provide a graphic display,
consider the BUDGET FOR 1980 example given
earlier. Here the mapping is BUDGET FOR 1980, the
set of row-labels is {Salaries, Travel, Equipment,
Maintenance, Miscellaneous}, the range set of
quantities is {(35), (10), (25), (18), (12)}, the
set of column-labels is {Amount} and the set of
units is {thousands of dollars}. Let the totality
and continuity of row-labels be {True} and {False}
respectively. Given this information and no
preferences on the user's part, the system's task
is to observe the data and its characteristics,
decide what type of graphic format is both
suitable and feasible with respect to the graphic
device that is available, decide its attributes
and then display the picture. (Although it should
also allow the user to modify the resulting
display, this aspect of the user interface will
not be discussed.) For this example, the system
selects a pie chart representation to express the
totality of the row-labels. This pie chart
representation is an appropriate choice as
confirmed in the literature:

"Because a circle gives such a clear impression of
being a total, a pie chart is ideally suited for
the one purpose it serves - showing the relative
sizes of the components of some whole."
[Zelazny, 1972]


"...the separation of a whole amount in terms of
its component quantities. In the graphic figure,
a circular form can be used to represent a whole
amount, and can be divided into segments which
represent proportional quantities, or percentages,
of the whole." [Bowman, 1968]
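The proportional division of the circle described in these quotations amounts to simple arithmetic. The following is a minimal sketch, assuming a hypothetical helper `pie_slices` that is not part of the system described here; it turns the BUDGET FOR 1980 amounts into fractions of the whole and into start and end angles in degrees.

```python
from typing import List, Sequence, Tuple

# (label, share of the whole, start angle, end angle), angles in degrees
Slice = Tuple[str, float, float, float]


def pie_slices(labels: Sequence[str], amounts: Sequence[float]) -> List[Slice]:
    """Divide a full circle into segments proportional to each amount."""
    if len(labels) != len(amounts):
        raise ValueError("need exactly one amount per label")
    total = sum(amounts)
    if total <= 0:
        raise ValueError("the amounts must add up to a positive whole")
    slices: List[Slice] = []
    start = 0.0
    for label, amount in zip(labels, amounts):
        sweep = 360.0 * amount / total  # this segment's share of the circle
        slices.append((label, amount / total, start, start + sweep))
        start += sweep
    return slices


budget_slices = pie_slices(
    ["Salaries", "Travel", "Equipment", "Maintenance", "Miscellaneous"],
    [35, 10, 25, 18, 12],  # thousands of dollars
)
```

Because the five amounts add up to 100, each amount is also its percentage of the budget, which is exactly the reading that the totality property licenses.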


As we noted above, the user has not stated
any preferences regarding the display. This being
the case, the choice of whether or not to color
[...] the user. For a device such as the printed page
the choice of colors is black and white.


Case 3. If the continuity of row-labels is {true,
false}, the graphic format selected would be [...]


Case 4. If the continuity and totality of
row-labels are {false, false} and {true, true}
respectively, then the data would be presented in
the form on the top right.


Case 5. If the continuity and totality of
row-labels are {false, false} and {false, false};
and the totality of column-labels is {true, true,
true, true, true, true, true, true}, then the
graphic format on the bottom right represents the
input data.


IV. System Overview

The overview of the proposed system currently
under development is given in the following
figure.

[Figure: system overview, with the nodes INPUT DATA,
FORMAT SELECTION, ATTRIBUTE SELECTION, GRAPHIC
PROCEDURES and DISPLAY]


We are making the following three assumptions
with respect to this system design:
(i) DATA is expected from an existing database.
The system expects a table of information which
has both row-labels and column-labels. Either of
these sets may be tagged with the properties of
continuity and/or totality. These two properties
of the mapping are expected as input to the system
along with the data mapping and information on
units of measurement for the quantities in the
range set.

(ii) DEVICE is expected to have a set of routines
for drawing and erasing points, lines and
characters, and for setting colors or gray values.
(iii) USER is expected to be able to type in the
request for a graphic display.
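Assumption (ii) can be read as a small device interface. The sketch below is one hypothetical rendering of it; the class names `Device` and `RecordingDevice` and all method names are assumptions made for illustration, since the paper does not name the routines. The recording device only logs the primitive commands it receives, which is convenient for checking what a display procedure would send to real hardware.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

Point = Tuple[float, float]


class Device(ABC):
    """Routines a graphic device is assumed to offer (names are hypothetical)."""

    @abstractmethod
    def emit(self, op: str, *args: object) -> None:
        """Send one primitive command to the device."""

    def draw_point(self, p: Point) -> None:
        self.emit("draw_point", p)

    def erase_point(self, p: Point) -> None:
        self.emit("erase_point", p)

    def draw_line(self, a: Point, b: Point) -> None:
        self.emit("draw_line", a, b)

    def erase_line(self, a: Point, b: Point) -> None:
        self.emit("erase_line", a, b)

    def draw_text(self, at: Point, text: str) -> None:
        self.emit("draw_text", at, text)

    def erase_text(self, at: Point, text: str) -> None:
        self.emit("erase_text", at, text)

    def set_color(self, name: str) -> None:
        self.emit("set_color", name)

    def set_gray(self, level: float) -> None:
        self.emit("set_gray", level)


class RecordingDevice(Device):
    """Stand-in device that records every command instead of drawing it."""

    def __init__(self) -> None:
        self.commands: List[tuple] = []

    def emit(self, op: str, *args: object) -> None:
        self.commands.append((op, *args))
```

Funnelling every routine through a single `emit` method keeps a new device driver down to one method, at the cost of assuming that all devices accept the same set of primitives.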

The information from a database enters the
system at the node INPUT DATA. The data is passed
to the next node FORMAT SELECTION.


Depending on the characteristics of input
data such as multiplicity, cardinality, units,
continuity and totality; and of graphic device
such as device type, spatial and intensity or
color resolution; a default graphic format (such
as a pie chart) will be selected to display
information. These rules of selecting a
particular display format are defined after
consulting Bertin (1973), Bowman (1968) and Gene
Zelazny (1972 and 1980) and studying various
graphic representations.
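The paper does not list the selection rules themselves, so the following is only a sketch of what a default rule table might look like. The function name `select_format`, the ordering of the rules and the `max_pie_slices` threshold are all assumptions. The one case the paper does confirm is that a single column whose row-labels have totality is shown as a pie chart. Device characteristics are ignored here for brevity.

```python
from typing import Sequence


def _holds(flags: Sequence[bool]) -> bool:
    """A property holds only if it is set for every column (an empty set does not count)."""
    return len(flags) > 0 and all(flags)


def select_format(
    multiplicity: int,
    cardinality: int,
    continuity: Sequence[bool],
    totality: Sequence[bool],
    max_pie_slices: int = 8,  # hypothetical legibility limit, not from the paper
) -> str:
    """Pick a default graph type from the data properties (assumed rules)."""
    if _holds(totality) and multiplicity == 1 and cardinality <= max_pie_slices:
        return "pie chart"      # parts of one whole, as in BUDGET FOR 1980
    if _holds(continuity):
        return "trends graph"   # assumed: labels forming a continuum suggest a trend
    return "bar graph"          # assumed fallback for plain comparisons
```

With the BUDGET FOR 1980 properties, `select_format(1, 5, [False], [True])` selects the pie chart, matching the choice described in Section III.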

Once the appropriate graphic format has been
selected, the format and the information to be
displayed are passed to the next node, the
ATTRIBUTE SELECTION. This state consults the
device knowledge and domain specific knowledge to
determine the attributes of the display such as
color and icons. The output of this state
consists of data and device parameters.
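As an illustration of how device knowledge could drive one attribute, the sketch below assigns a fill to each row-label. It is an assumed policy, not the system's: the function name `select_fills` is hypothetical, and the rule is simply to use distinct colors when the device offers enough of them and otherwise to fall back to evenly spaced gray values, as would be needed on a black and white device such as the printed page.

```python
from typing import Dict, Optional, Sequence, Union

Fill = Union[str, float]  # a color name, or a gray value from 0.0 (black) to 1.0 (white)


def select_fills(
    labels: Sequence[str],
    color_palette: Optional[Sequence[str]] = None,
) -> Dict[str, Fill]:
    """Choose one fill per label from what the device can show (assumed policy)."""
    n = len(labels)
    if n == 0:
        return {}
    if color_palette is not None and len(color_palette) >= n:
        # Color device with enough colors: one distinct color per label.
        return dict(zip(labels, color_palette))
    if n == 1:
        return {labels[0]: 0.5}
    # Monochrome device, or too few colors: spread gray values evenly.
    return {label: i / (n - 1) for i, label in enumerate(labels)}
```

A fuller version would also consult domain specific knowledge, for example a conventional color for a particular budget item, which this sketch leaves out.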

Depending upon these parameters, the next
node, GRAPHIC PROCEDURES, generates the graphic
commands to a particular device that realizes the
display.

DISPLAY  is  the  actual  display  of  information, 
the  final  output  of  the  system,  in  the  graphic 
format. 

The graphic display is obtained by simply
requesting the system to present tabular
information graphically. If the display is not
satisfactory to the user, it may be modified. The
modifications are provided at three levels: (i)
input data could be modified by selecting or
grouping the row-labels to be displayed, (ii) the
properties such as totality or continuity could be
changed thereby changing the format of the display
and (iii) attributes of display could be changed.

V. Summary


In summary, this paper has discussed the
system [Gnanamgari, 1980] which we have designed
to provide appropriate defaults for those aspects
of the presentation of a user's data that s/he
either does not care about or assumes the system
can "obviously" infer. The underlying
structures of input data have been studied and
abstracted and relevant properties of data have
been recognized. A reasonably large set of
graphic formats have been defined for presenting
data. Currently we are working on knowledge
representation issues of the system.